This paper developed two robust data-driven models, namely 
gene expression programming (GEP) and multivariate 
adaptive regression splines (MARS), for the estimation of 
the slump of concrete (SL). The main feature of the proposed 
data-driven methods is to provide explicit mathematical 
equations for estimating SL. The experimental data set 
contains five input variables, including the water-cement 
ratio (W/C), water (W), cement (C), river sand (Sa), and 
Bida Natural Gravel (BNG) used for the estimation of SL. 
Three common statistical indices, such as the correlation 
coefficient (R), root mean square error (RMSE), and mean 
absolute error (MAE), were used to evaluate the accuracy of 
the derived equations. The statistical indices revealed that the 
GEP formula (R=0.976, RMSE=19.143, and MAE=15.113) 
was more accurate than the MARS equation (R=0.962, 
RMSE=23.748, and MAE=16.795). However, the 
application of MARS, due to its simple regression equation 
for estimating SL, is more convenient for practical purposes 
than the complex formulation of GEP. 
1. Introduction 


Concrete is an essential construction material that plays a crucial role in the development of 
infrastructure, providing strength and durability to buildings, bridges, and other structures [1]. 
Therefore, many studies have been performed on the properties of concrete using different 
conventional statistical models and data-driven methods due to the material's importance. The 
ease with which concrete may be blended, poured, compacted, and finished is referred to as its 
workability. A concrete mixture that is hard to mix and compact will add to the cost of 
management and result in inadequate strength, durability, and attractiveness. For producing high- 
quality concrete, the workability of concrete is a critical component that must be studied [2]. The 
slump test is frequently employed to assess the workability of fresh concrete. Slump is an 
essential metric for gauging the consistency of concrete quality, significantly impacting the 
quality of civil engineering projects [3]. 


Data-driven methods are a successful and reliable alternative to traditional regression analysis 
for complex systems in which the objective is to determine relationships between input and 
output variables [4-9]. They are widely used to model concrete properties [10-15], and many 
scholars worldwide are interested in using data-driven techniques to evaluate concrete 
characteristics. The main shortcoming of traditional regression techniques is their inability to 
provide appropriate estimation results for complex problems [16,17]. Soft computing methods 
overcome these difficulties and can provide more precise results. 


Data-driven models have become increasingly popular in the concrete industry, and many studies 
have utilized these models to improve various aspects of concrete production. For instance, 
Cao et al. [18] employed machine-learning techniques to estimate the porosity of 
high-performance concrete. Their research revealed that gradient-boosting trees outperformed random 
forests regarding prediction accuracy. In a study conducted by Golafshani et al. [19], a 
combination of Particle Swarm Optimization (PSO) and a fuzzy inference system was utilized to 
model the compressive strength (CS) of eco-friendly concrete. The work of Golafshani et al. [20] 
introduced robust modeling for approximating the CS of concrete employing an artificial neural 
network (ANN) and adaptive neuro-fuzzy inference system (ANFIS) techniques. The integration 
of Grey Wolf Optimizer (GWO) enhanced the performance of these models. Badawi et al. [21] 
proposed the application of ANN to predict the CS and slump of concrete by considering input 
parameters related to the concrete mix design. The developed ANN model, implemented using 
the MATLAB neural network toolbox, exhibited a strong correlation with experimental data. Soft 
computing approaches for estimating the CS and slump of concrete were discussed by Timur 
Cihan [22], who aimed to identify optimal techniques for normalization, regression, and feature 
selection to achieve accurate predictions. 


Tang et al. [23] introduced a hybrid machine learning model that combined Support Vector 
Regression (SVR) with Grid Search (GS) to accurately predict the CS of fly ash concrete. 
Through experimentation with 98 datasets, this hybrid model demonstrated its potential as an 
effective method for CS prediction, outperforming the stand-alone SVR model. Behnood and 
Golafshani [24] employed the M5P algorithm to model the mechanical properties of concrete 
incorporating waste foundry sand (WFS). Rajakarunakaran et al. [25] proposed the use of 
machine learning-based regression approaches to estimate the CS of self-compacting concrete 
(SCC). 


Estimating the slump of concrete is a challenging task because concrete constituents, such as 
water, cement, and aggregate, interact nonlinearly in their effect on slump. Moreover, slump is 
an essential property that influences the workability and performance of concrete, making it a 
critical factor in construction. To address this challenge, some investigations have used 
data-driven models to predict the slump of concrete. For instance, Chine et al. [26] utilized 
multiple linear regression (MLR) and ANN models to predict the slump of concrete. Their findings 
suggested that ANN was more accurate than MLR in approximating the slump for different grades 
of concrete. 


Similarly, Agrawal and Sharma [27] combined ANN and genetic algorithms (GA) to 
estimate the slump of concrete. They showed that their combined model improved the prediction 
accuracy of the slump compared to the stand-alone ANN model. Rezaie and Sadighi [28] 
developed a linear regression model and an adaptive neuro-fuzzy inference system (ANFIS) to 
predict the slump of lightweight aggregate concrete. Their results indicated that ANFIS was more 
accurate than linear regression in predicting the slump of concrete. Islam et al. [29] developed 
statistical analysis and regression models, based on laboratory results, to estimate the slump of 
concrete incorporating rice husk ash (RHA). Furthermore, Onikeku et al. [30] employed both ANN 
and MLR approaches to predict the slump of concrete containing two blended agro-waste 
materials and achieved acceptable results. Oztas et al. [31] demonstrated the effectiveness of 
ANN in predicting the slump values of high-strength concrete. Singh et al. [32] developed an 
ANN model for determining the slump of concrete using laboratory tests. Yeh [33] showed that 
ANN outperformed traditional regression methods in predicting the slump of concrete. 


Recently, Yusuf et al. [34] developed several ANN models with different numbers of hidden 
nodes to predict the slump of concrete and compared them with MLR using statistical measures. 
Their results indicated that the ANN model with twenty hidden nodes had the best performance in 
predicting the slump; an MLR equation was also obtained to predict the slump. 


Recent studies have demonstrated numerous applications of black-box methods, such as ANN, 
for predicting the characteristics of concrete, particularly concrete slump. 
However, black-box models suffer from the limitation of not providing explicit relationships 
among the variables involved in a complex problem [35-37]. A review of previous studies 
reveals that the GEP and MARS methods have seen limited use in concrete slump estimation 
compared to black-box methods, despite their capability of offering explicit mathematical 
relationships. Nonetheless, these methods have shown that they can estimate complex parameters 
accurately. Moreover, the relationships these models provide can be easily applied by engineers 
in practice. 

This study employed two powerful data-driven methods, gene expression programming (GEP) 
and multivariate adaptive regression splines (MARS), to predict the slump of concrete. GEP and 
MARS are highly effective techniques for simulating complex processes, as they can represent 
intricate input-output relationships without requiring prior knowledge of the phenomenon. The 
primary objective of this study is to develop explicit models for predicting SL. The proposed 
GEP and MARS models provide mathematical equations that can be used to predict the slump of 
concrete. 


2. Material and methods 


2.1. Data samples 


The data samples used for developing the data-driven models were obtained from a previous study 
by Yusuf et al. [34], who conducted experimental work on the prediction of SL. 
They used Bida natural gravel (BNG) as coarse aggregate in concrete mixes and developed ANN 
models and MLR equations to predict SL. The main factors investigated by Yusuf et al. [34] that 
influenced concrete slump (SL) were the water-cement ratio (W/C), water (W), cement (C), 
sand (Sa), and Bida natural gravel (BNG). They measured SL for 36 concrete mixes. Therefore, 
the following functional relationship was considered for modeling SL: 


$$\mathrm{SL} = f(\mathrm{W/C},\ \mathrm{W},\ \mathrm{C},\ \mathrm{Sa},\ \mathrm{BNG}) \tag{1}$$


Table 1 provides the main statistical parameters for the prediction of SL. 


Table 1 
The main statistical parameters for the prediction of SL [34]. 
Parameter      Category   Min       Max        Average 
W/C            Input      0.40      0.60       0.50 
W (kg/m³)      Input      129.07    283.72     194.10 
C (kg/m³)      Input      303.02    523.28     390.54 
Sa (kg/m³)     Input      496.50    1023.16    703.30 
BNG (kg/m³)    Input      778.77    1262.00    1011.85 
SL (mm)        Output     0         270        67.88 


2.2. Overview of GEP and modeling SL 


The gene expression programming (GEP) algorithm is an evolutionary algorithm that extends 
genetic algorithms and genetic programming [38]. The results of GEP are computer programs 
that evolve by changing their sizes, forms, and compositions to produce complicated tree 
structures [39]. The GEP algorithm employs linear chromosomes composed of genes, each 
generally structured in a head and a tail. 

GEP comprises five major elements: the definition of the fitness function, the definition of the 
terminal set and mathematical functions, the determination of the chromosome structure (such as 
the number of genes), the choice of the linking function and the control parameters, and the stop 
criterion [40]. The results of GEP are expressed in the form of tree-like structures, namely 
sub-expression trees (sub-ETs). Moreover, GEP includes a unique multigenic feature that enables 
the evolution of more complicated programs with numerous sub-programs [41]. Each GEP gene is a 
fixed-length string of symbols, and these symbols can represent any component of the function 
and terminal sets. The function set may include any user-defined function or the basic 
mathematical operators (+, -, ×, and /). 
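
To make the head/tail encoding more concrete, the following minimal Python sketch decodes a single gene written in Karva notation (the level-order encoding commonly used in GEP) into a sub-ET and evaluates it. The symbol `Q` for the square root, the single-letter terminals `a` to `d`, and all helper names are illustrative assumptions; they are not taken from this study or from any specific GEP software.

```python
import math
from collections import deque

# Illustrative function set: symbol -> (arity, implementation). Assumed, not from the study.
FUNCTIONS = {
    "+": (2, lambda a, b: a + b),
    "-": (2, lambda a, b: a - b),
    "*": (2, lambda a, b: a * b),
    "/": (2, lambda a, b: a / b),
    "Q": (1, math.sqrt),  # square root
}


def tail_length(head_length, max_arity):
    """Tail length that guarantees every gene decodes to a complete tree."""
    return head_length * (max_arity - 1) + 1


def decode_gene(gene):
    """Build a sub-ET from a gene by filling the tree level by level."""
    symbols = iter(gene)
    root = [next(symbols), []]  # node = [symbol, children]
    pending = deque([root])
    while pending:
        node = pending.popleft()
        arity = FUNCTIONS[node[0]][0] if node[0] in FUNCTIONS else 0
        for _ in range(arity):
            child = [next(symbols), []]
            node[1].append(child)
            pending.append(child)
    return root  # trailing symbols that were not needed are simply ignored


def evaluate(node, terminals):
    """Recursively evaluate a decoded sub-ET for given terminal values."""
    symbol, children = node
    if symbol in FUNCTIONS:
        func = FUNCTIONS[symbol][1]
        return func(*(evaluate(child, terminals) for child in children))
    return terminals[symbol]


# Head "Q*+-abc" (length 7) followed by a tail of 8 terminals.
gene = "Q*+-abcdabcdabc"
tree = decode_gene(gene)  # encodes sqrt((a + b) * (c - d))
value = evaluate(tree, {"a": 1.0, "b": 2.0, "c": 7.0, "d": 4.0})
```

In a multigenic chromosome, each gene would be decoded in this way into its own sub-ET, and the sub-ETs would then be combined by the linking function.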


Each gene in the GEP comprises mathematical operators, variables, and constants used to encode 
a mathematical formula. The GEP parameters used to calculate the slump of concrete are listed in 
Table 2. For modeling SL, the input variables (W/C), (W), (C), (Sa), and (BNG) were used. The 
outcome of the GEP model is shown in Fig. 1, which presents the sub-ETs obtained for estimating 
SL with the GEP implementation. 


For modeling the slump of concrete with the GEP approach, the GeneXproTools software was 
utilized. The parameter settings of the GEP model and the genetic operators, such as mutation, 
inversion, and transposition, are displayed in Table 2. Ebtehaj and Bonakdari [38] suggested 
that a population size between 30 and 100 can yield satisfactory results. Therefore, this 
investigation employed 50 chromosomes, selected by trial and error. The number of generations 
was set to 300,000, and the RMSE fitness function was selected for GEP model development via 
trial and error. In addition, the function set consisted of +, -, ×, ÷, Exp, Ln, x², x³, √, ∛, 
Sin, Cos, and Atan. It is worth mentioning that the RMSE fitness function has shown the best 
performance in similar studies [8,42,43]. Previous studies also found favorable outcomes using 
addition as the linking function between sub-ETs. 


Table 2 
The parameters of the GEP model for the estimation of the slump of concrete. 
Parameter                  Value 
Number of genes            3 
Mutation rate              0.044 
One-point recombination    0.3 
Two-point recombination    0.3 
IS transposition           0.1 
RIS transposition          0.1 
Gene recombination         0.1 
Gene transposition         0.1 


It is worth mentioning that the GEP parameter settings were chosen based on previous studies 
and a trial-and-error process. 
[Figure: expression trees for Sub-ET 1, Sub-ET 2, and Sub-ET 3] 

Fig. 1. Sub-ETs obtained from GEP for prediction of SL. 


The explicit equations related to Fig. 1 are as follows: 
Sub-ET 1 = (Sin(ln(d, × d3) - √da)) × c? 
Sub-ET 2 = (dy × d3) - Sin(√d,) × (c,/ds)    (2) 
The value of SL was obtained as follows: 


$$\mathrm{SL} = \text{Sub-ET 1} + \text{Sub-ET 2} + \text{Sub-ET 3} \tag{3}$$


where the values of the constant c in Sub-ET 1, Sub-ET 2, and Sub-ET 3 are -8.966126, -8.851715, and 
6.780976, respectively. In addition, the variables of dy , d, ,d3 and d, are BNG, C , W/C and 
W, respectively. 


2.3. Overview of MARS and modeling SL 


Multivariate adaptive regression splines (MARS) is a well-known data-driven model that has been 
used successfully in the civil engineering field. MARS is a robust non-parametric regression method 
that can model complex relationships between dependent and independent variables. The 
algorithm generates a regression model that predicts the dependent variable based on several 
independent variables using piecewise regression parts (called basis functions) [44]. 


The MARS algorithm constructs a piecewise linear regression model by dividing the 
independent variables into smaller subregions and fitting simple basis functions to each 
subregion [45]. The algorithm selects the appropriate basis function and independent variables 
for each subregion and determines the breakpoints or knots that define the boundaries of each 
subregion. 


In the MARS method, the basis function is crucial in capturing the underlying relationships 
between the input and target variables. MARS utilizes a set of basis functions defined as 
piecewise linear segments. The algorithm starts with a simple model consisting of a constant 
term and gradually adds basis functions to capture non-linearities and interactions. At each step, 
the algorithm assesses the contribution of potential basis functions using a statistical criterion, 
such as the generalized cross-validation (GCV) score. 
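
For reference, one common form of the GCV criterion, following Friedman's original formulation of MARS, is sketched below. The symbols $N$ (number of training samples), $M$ (number of basis functions), $\hat{f}_M$ (the fitted model with $M$ basis functions), and $C(M)$ (an effective number of parameters that grows with the number of basis functions and knots) are introduced here for illustration only. The exact complexity penalty used in this study is not stated, and its definition varies between implementations.

```latex
\mathrm{GCV}(M) = \frac{\dfrac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{f}_M(x_i)\right)^2}
                       {\left(1 - \dfrac{C(M)}{N}\right)^2}
```

The numerator is the mean squared residual and the denominator inflates it as the model becomes more complex, so a lower GCV indicates a better trade-off between goodness of fit and model size.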


The equation of MARS and the related basis functions are as follows [46]: 

$$\mathrm{SL} = \beta_0 + \sum_{m=1}^{M} \beta_m \, BF_m(x) \tag{4}$$

$$BF_m(x) = \max(0,\ c - x) \tag{5}$$

or 

$$BF_m(x) = \max(0,\ x - c) \tag{6}$$

where $\beta_0$ is the constant term, $\beta_m$ is the coefficient of the $m$-th basis function 
$BF_m(x)$, $M$ is the number of basis functions, $x$ is the input variable, and $c$ is the 
threshold value of the input variable. 


The MARS method provided the following explicit equation for the prediction of SL: 

$$\mathrm{SL} = 31.6395 - 0.733976\,BF_1 - 652.467\,BF_2 - 4.49362\,BF_3 + 7.2058\,BF_4 - 0.817229\,BF_5 \tag{7}$$

where $BF_1 = \max(0,\ 181.66 - W)$, $BF_2 = \max(0,\ W/C - 0.55)$, $BF_3 = \max(0,\ W - 209.41)$, 
$BF_4 = \max(0,\ W - 192.04)$, and $BF_5 = \max(0,\ C - 383.84)$. 
The GCV value of the proposed MARS equation was 1414.98. 
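
Because Eq. (7) is explicit, it can be evaluated with a few lines of code. The following Python sketch is one possible transcription, with the coefficients and knots read from Eq. (7). The function and argument names are illustrative assumptions, and the inputs are assumed to be in the same units as Table 1. Sa and BNG do not appear in Eq. (7), so they are not arguments.

```python
def hinge(value):
    """Hinge function max(0, value) used by the MARS basis functions."""
    return max(0.0, value)


def mars_slump(wc, w, c):
    """Estimate the slump SL (mm) from Eq. (7).

    wc: water-cement ratio, w: water content, c: cement content
    (assumed to be in the units of Table 1).
    """
    bf1 = hinge(181.66 - w)
    bf2 = hinge(wc - 0.55)
    bf3 = hinge(w - 209.41)
    bf4 = hinge(w - 192.04)
    bf5 = hinge(c - 383.84)
    return (31.6395 - 0.733976 * bf1 - 652.467 * bf2
            - 4.49362 * bf3 + 7.2058 * bf4 - 0.817229 * bf5)


# Hypothetical mix, for illustration only (not a sample from the data set).
example = mars_slump(wc=0.50, w=200.0, c=400.0)
print(f"Estimated slump: {example:.1f} mm")
```

For this hypothetical mix only the fourth and fifth basis functions are active, and the estimate is about 75.8 mm. The raw equation is not clamped, so for some inputs it can return values below zero; how to treat such cases is left to the user.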


3. Results and discussions 


Common statistical measures, such as root mean squared error (RMSE), correlation coefficient 
(R), and mean absolute error (MAE), are used to evaluate the accuracy of the proposed 
algorithms. These statistical measures are as follows: 


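$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - y_i)^2}{n}} \tag{8}$$

$$R = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{9}$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| x_i - y_i \right| \tag{10}$$

where $x_i$ and $\bar{x}$ are the observed values of SL and their mean, $y_i$ and $\bar{y}$ are the 
predicted values and their mean, and $n$ is the total number of data samples. The statistical 
measures for the training and testing data sets are tabulated in Table 3. 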
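
The three measures in Eqs. (8)-(10) can be computed as in the following minimal Python sketch. The function names and the sample values are illustrative assumptions and are not data from this study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, Eq. (8)."""
    n = len(observed)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(observed, predicted)) / n)


def correlation(observed, predicted):
    """Correlation coefficient R, Eq. (9)."""
    n = len(observed)
    x_mean = sum(observed) / n
    y_mean = sum(predicted) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(observed, predicted))
    var_x = sum((x - x_mean) ** 2 for x in observed)
    var_y = sum((y - y_mean) ** 2 for y in predicted)
    return cov / math.sqrt(var_x * var_y)


def mae(observed, predicted):
    """Mean absolute error, Eq. (10)."""
    n = len(observed)
    return sum(abs(x - y) for x, y in zip(observed, predicted)) / n


# Hypothetical slump values in mm, for illustration only.
observed = [0.0, 50.0, 100.0, 150.0]
predicted = [10.0, 40.0, 110.0, 140.0]

print(rmse(observed, predicted), correlation(observed, predicted), mae(observed, predicted))
```

In this toy example every prediction is off by exactly 10 mm, so RMSE and MAE coincide; they diverge as soon as the errors differ in size, because RMSE weights large errors more heavily.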


Table 3 
Statistical measures of the GEP and MARS models for estimation of SL for the training and testing datasets. 
Approach        RMSE      R        MAE 
GEP (Train)     18.553    0.978    14.011 
GEP (Test)      19.366    0.975    15.537 
MARS (Train)    19.744    0.974    16.127 
MARS (Test)     25.118    0.957    17.051 


The two data-driven models, GEP and MARS, were used to predict the slump of the concrete. 
Both models were developed using training data, and their accuracy and performance were 
evaluated using testing data. As seen in Table 3, the GEP model achieved an RMSE of 18.553, an 
R of 0.978, and an MAE of 14.011 when developed on the training data. In addition, when 
evaluated on the testing data, the GEP model achieved an RMSE of 19.366, an R of 0.975, and 
an MAE of 15.537. On the other hand, the MARS model achieved an RMSE of 19.744, an R of 
0.974, and an MAE of 16.127 when developed on the training data. Moreover, when assessed on 
the testing data, the MARS model achieved an RMSE of 25.118, an R of 0.957, and an MAE of 
17.051. 


Comparing the results of the proposed models, it can be seen that the GEP model performed 
better than the MARS model in terms of RMSE and MAE for both training and testing data. The 
correlation coefficient (R) for both models is relatively high, indicating a strong linear 
relationship between the predicted and observed values. It is important to note that while the 
GEP model performed better overall, it produced a more complex structure for the prediction of 
SL than the simple equation provided by the MARS model. 


A closer look at the error indicators, RMSE and MAE, gives further insight into what the 
performance of the GEP and MARS models implies for civil engineering applications. RMSE and MAE 
are commonly used error metrics for assessing data-driven methods, and lower values of both 
indicate better agreement between the model predictions and the observed values. RMSE is the 
square root of the mean squared prediction error, so it penalizes large errors more heavily. 
MAE is the mean of the absolute differences between the model predictions and the observed 
values, so every error contributes in proportion to its size. 


In the context of predicting concrete slump, the GEP model demonstrated lower RMSE and 
MAE values compared to MARS during both the training and testing stages. These results lead 
to the conclusion that the GEP model outperforms MARS in terms of accuracy for concrete slump 
prediction. The superior accuracy of GEP over MARS in predicting concrete slump 
holds promising practical implications for civil engineering applications. The more accurate 
predictions obtained from GEP can provide engineers and construction professionals with 
reliable information regarding the workability and consistency of concrete mixes. This, in turn, 
enables better planning and optimization of construction processes, leading to improved quality 
control, cost efficiency, and overall project performance. However, it should be noted that GEP 
provides a complex equation for predicting the slump of concrete. While the complex equation 
may offer higher accuracy, it may also pose challenges in terms of interpretation and 
implementation in real-world scenarios. 


On the other hand, although MARS exhibited lower accuracy than GEP in predicting 
concrete slump, it still holds practical merits in civil engineering applications. MARS can 
provide interpretable models, allowing engineers to gain insights into the relationships between 
variables, and it is able to capture interactions between predictors affecting the slump of 
concrete. Moreover, MARS provided a more straightforward equation with less complexity than the 
GEP equation for predicting the slump (SL). This simplicity can be advantageous regarding model 
interpretation and computational efficiency, especially in cases with limited data or quick 
exploratory analyses. 


In summary, the superior accuracy of GEP in predicting concrete slump offers significant 
practical implications for civil engineering applications. However, the complexity of the GEP 
equation should be considered, as it may affect interpretation and implementation. Despite its 
lower accuracy, MARS provided interpretable models with simpler equations, making it a viable 
option for gaining insights into variable relationships and conducting efficient analyses in 
concrete slump prediction. 

The flowchart of the present study is summarized in Fig. 2. 


[Flowchart: Experimental data set; split into training data (70%) and testing data (30%); training the white-box data-driven models; performance evaluation using R, RMSE, and MAE; select the best model] 

Fig. 2. Flowchart of the present study for the prediction of the slump of concrete. 
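
The 70%/30% partition in the flowchart can be sketched as follows. This is only an illustration: the source does not state how the mixes were assigned to the two subsets, so the random shuffling, the seed, and the function name `split_train_test` are assumptions.

```python
import random


def split_train_test(samples, train_fraction=0.7, seed=0):
    """Randomly split samples into training and testing subsets."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility (assumed)
    n_train = round(train_fraction * len(samples))
    train = [samples[i] for i in indices[:n_train]]
    test = [samples[i] for i in indices[n_train:]]
    return train, test


# The 36 concrete mixes, labelled 1..36 purely for illustration.
mix_ids = list(range(1, 37))
train_ids, test_ids = split_train_test(mix_ids)
print(len(train_ids), len(test_ids))
```

With 36 mixes, a 70%/30% split rounds to 25 training and 11 testing samples.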


Moreover, graphical comparisons of the observed and predicted SL for the training and testing data are shown in Figs. 3-6. 

[Plot: GEP (Training data); x-axis: Data Samples; series: Observed, GEP] 
Fig. 3. GEP results for prediction of SL using training data. 

[Plot: GEP (Testing data); x-axis: Data Samples; series: Observed, GEP] 
Fig. 4. GEP results for prediction of SL using testing data. 

[Plot: MARS (Training data); x-axis: Data Samples; series: Observed, MARS] 
Fig. 5. MARS results for prediction of SL using training data. 

[Plot: MARS (Testing data); x-axis: Data Samples; series: Observed, MARS] 
Fig. 6. MARS results for prediction of SL using testing data. 


These figures indicate that the GEP model better captured the complex relationships between 
the independent variables and SL. For the training data, the GEP predictions closely followed 
the observed values, indicating a good fit. For the testing data, the GEP model had lower 
prediction errors and was closer to the observed values than the MARS model, indicating better 
generalization ability. In addition, for further comparison, the results for all data sets were 
compared with the MLR model proposed by the earlier study [34]. Table 4 provides the values of 
the statistical measures for the GEP, MARS, and MLR models for the prediction of SL. 

Table 4 
Statistical measures of the proposed models for estimation of SL for all data sets. 
Approach    RMSE      R        MAE 
GEP         19.143    0.976    15.113 
MARS        23.748    0.962    16.795 
MLR [34]    29.417    0.942    23.309 


The results showed that GEP outperformed MARS in terms of accuracy for predicting the slump 
of concrete. The statistical measures of GEP, including RMSE, R, and MAE, were 19.143, 0.976, 
and 15.113, respectively. On the other hand, the statistical measures of MARS, including RMSE, 
R, and MAE, were 23.748, 0.962, and 16.795, respectively. It is important to note that both 
algorithms performed well in predicting SL, as indicated by the high correlation coefficients 
(R) and low RMSE and MAE values. However, GEP demonstrated a higher level of accuracy, as 
reflected in its higher R and lower RMSE and MAE. 


In addition, comparison with the MLR model (RMSE = 29.417, R = 0.942, and 
MAE = 23.309) revealed the higher performance of the proposed data-driven models, GEP and 
MARS, for the prediction of SL. It is worth mentioning that, unlike the black-box model 
(i.e., the ANN model), the proposed models provide explicit mathematical expressions for the 
prediction of SL. Furthermore, the R value of 0.98 for the ANN model of Yusuf et al. [34], 
compared to the R values of GEP (R=0.976) and MARS (R=0.962), indicated the acceptable 
performance of GEP and MARS as powerful data-driven models for the prediction of SL. 
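
To put the comparison with MLR in relative terms, the short Python sketch below computes the fractional reduction in RMSE with respect to the MLR baseline, using the all-data RMSE values reported in Table 4. The helper name `relative_reduction` is an illustrative assumption.

```python
# RMSE values over all data sets, copied from Table 4.
rmse_all = {"GEP": 19.143, "MARS": 23.748, "MLR": 29.417}


def relative_reduction(baseline, value):
    """Fractional reduction of an error metric relative to a baseline."""
    return (baseline - value) / baseline


gep_vs_mlr = relative_reduction(rmse_all["MLR"], rmse_all["GEP"])
mars_vs_mlr = relative_reduction(rmse_all["MLR"], rmse_all["MARS"])

print(f"GEP:  {gep_vs_mlr:.1%} lower RMSE than MLR")
print(f"MARS: {mars_vs_mlr:.1%} lower RMSE than MLR")
```

On these figures, GEP reduces the RMSE by roughly 35% and MARS by roughly 19% relative to MLR.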


Therefore, it was concluded that the proposed white-box data-driven models were more accurate 
than the traditional regression approach. In fact, traditional regression techniques face inherent 
limitations when accurately estimating complex problems related to concrete properties. These 
limitations arise due to the linear nature of traditional regression models, which struggle to 
capture the nonlinear relationships and interactions present in such complex systems. To address 
these limitations, researchers have turned to data-driven models that offer greater flexibility and 
adaptability in capturing complex patterns. GEP and MARS are two such data-driven approaches 
that have shown promise in overcoming the limitations of traditional regression techniques. 


GEP is an evolutionary-based, data-driven model that can automatically evolve mathematical 
expressions to model complex relationships. By incorporating nonlinear functions and 
interactions, GEP enables a more accurate estimation of concrete properties compared to 
traditional regression techniques. Similarly, MARS is a flexible data-driven model that can 
capture nonlinear relationships and interactions using a piecewise regression approach. By 
adaptively partitioning the data and fitting regression models within each partition, MARS 
provides a more robust estimation of complex concrete properties. 


It is worth mentioning that the mathematical equations presented by the GEP and MARS 
methods can be beneficial for civil engineers in estimating concrete slump. Civil engineers can 
use the proposed equations to estimate the slump of concrete without expertise in soft computing 
methods, and the equations do not require any special software. Moreover, unlike the ANN 
method, the GEP model is regarded as a white-box, data-driven approach capable of establishing 
mathematical relationships among the relevant variables in the problem. Furthermore, in 
comparison to the MLR technique, GEP can generate a complex equation that more accurately 
depicts the relationship between the influencing variables, facilitating the estimation of the 
concrete slump with greater precision. 


4. Summary and conclusions 


This study compared the accuracy and performance of two explicit data-driven algorithms, gene 
expression programming (GEP) and multivariate adaptive regression splines (MARS), for 
predicting the slump of concrete (SL). Statistical measures such as RMSE, MAE, and R were 
used to assess the methods' accuracy. According to the evaluation metrics, the GEP method 
makes better predictions than MARS and the regression method. The outcomes demonstrated 
that both GEP and MARS could accurately estimate SL. 


In conclusion, this study has demonstrated that GEP is a more accurate algorithm than MARS for 
predicting SL. The findings of this study may have important implications for the concrete 
industry, as accurate predictions of SL values can help optimize the production process and 
ensure the quality of the final product. The GEP model uses complex structures and complex 
equations to estimate concrete slump. In contrast, the MARS model uses simple regression 
relationships to estimate concrete slump. 


These findings suggest that GEP can be a more effective algorithm for predicting the slump of 
concrete. The higher accuracy of GEP can be attributed to its ability to model nonlinear 
relationships and interactions between variables to predict SL values. In contrast, MARS has a 
less complex formula for predicting SL and is more convenient for estimating SL than the 
complicated GEP formula. 


Overall, the results of this study provide valuable insights into the application of data-driven 
methods for predicting SL values. The superior performance of GEP over MARS highlights the 
importance of selecting appropriate algorithms for accurate and reliable predictions. These 
findings can be useful for researchers and engineers working in the field of concrete technology, 
as well as in other fields where accurate prediction of complex systems is critical. This study 
considered two white-box data-driven models for concrete slump estimation, namely the GEP 
and MARS models. For future work, it is suggested that the results obtained in the present 
study be compared with other white-box data-driven models that can provide mathematical 
equations for concrete slump estimation, such as decision trees (DTs) and group method of data 
handling (GMDH) approaches. 